Abstract: Association rule mining is an important task in data mining. It consists of two steps: first,
finding the frequent itemsets in a dataset, and second, deriving association rules from those frequent
itemsets. Frequent itemset mining is the crucial and most expensive step in association rule mining. Apriori
and Apriori-like algorithms are known to perform inefficiently because they make many repeated passes over
the database. In this paper we propose an improved technique for frequent itemset mining. The technique
scans the database only once and reduces the number of transactions.
I. INTRODUCTION

Data mining is applicable to real-world data from domains such as industry, textile showrooms and
supermarkets. Association rule mining is one of the data mining techniques; it generates association rules
from the frequent itemsets found in large data. Frequent patterns are patterns (i.e. itemsets) that appear
frequently in a dataset. For example, a set of items such as computer and antivirus that frequently appear
together in a transaction dataset is a frequent itemset. Frequent itemset mining is the process of finding
the frequent itemsets in a database (DB), small or large, where the database is either transactional or
relational.

Apriori and FP-Growth are the two best-known algorithms, each taking a different approach to finding
frequent itemsets [1][2]. The Apriori algorithm uses the Apriori property to improve the efficiency of the
level-wise generation of frequent itemsets. Its drawbacks are candidate generation and multiple database
scans. FP-Growth stores signatures of transactions in a tree structure, which removes the need for repeated
database scans, and it outperforms Apriori [2]. In this paper, Section 2 reviews related work; Section 3
proposes an improved technique for frequent itemset mining; Section 4 discusses the improved technique;
finally, Section 5 concludes the paper.

II. RELATED WORK 

2.1 Support 

Support is the ratio of the number of transactions that include all items in the antecedent and consequent parts of 
the rule to the total number of transactions. Support is an association rule interestingness measure. 



$$\mathrm{Support}(A \Rightarrow B) = P(A \cup B) = \frac{\text{Number of transactions containing both } A \text{ and } B}{\text{Total number of transactions}}$$

A and B represent itemsets in a database D.
2.2 Confidence 

Confidence is the ratio of the number of transactions that include all items in the consequent as well as 
antecedent to the number of transactions that include all items in antecedent. Confidence is an association rule 
interestingness measure. 



$$\mathrm{Confidence}(A \Rightarrow B) = P(B \mid A) = \frac{P(A \cup B)}{P(A)} = \frac{\text{Number of transactions containing both } A \text{ and } B}{\text{Number of transactions containing } A}$$

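
To make the two measures concrete, the following minimal sketch computes them on a small toy dataset. The helper names `support` and `confidence` and the four transactions are illustrative assumptions, not part of the original work.

```python
# Sketch only: `support` and `confidence` are hypothetical helper names,
# and the transactions below are toy data chosen for illustration.

def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of A and B together, divided by the support of A."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


transactions = [
    {"computer", "antivirus"},
    {"computer", "antivirus", "printer"},
    {"computer"},
    {"printer"},
]

# Rule: computer => antivirus
rule_support = support(transactions, {"computer", "antivirus"})
rule_confidence = confidence(transactions, {"computer"}, {"antivirus"})
print(rule_support)     # 2 of the 4 transactions contain both items
print(rule_confidence)  # 2 of the 3 transactions with "computer" also contain "antivirus"
```

Note that `confidence` divides by the support of the antecedent, so it assumes the antecedent occurs in at least one transaction.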
The Apriori algorithm, introduced by Agrawal in 1994 [1], is one of the first frequent pattern and
association rule mining algorithms. It finds all frequent itemsets and their association rules using a
breadth-first (level-wise) search, also known as an iterative approach. A key characteristic of the Apriori
algorithm is that it makes multiple passes over the database. This leads to its two main disadvantages:
the database is scanned many times, and a huge number of candidate itemsets is generated.
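
The level-wise search and its cost can be illustrated with a compact sketch of the classical Apriori procedure. This is a simplified textbook-style version written for illustration, not the implementation from [1]; the function name `apriori`, the pass counter and the toy transactions are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_supp):
    """Return ({itemset: absolute support count}, number of database passes)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent, passes, k = {}, 0, 1
    while candidates:
        passes += 1  # every level needs one full pass over the database
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_supp}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, passes


toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent, passes = apriori(toy_db, min_supp=3)
print(passes)         # number of passes over the database
print(len(frequent))  # number of frequent itemsets found
```

On this toy data the third pass is spent on a single candidate that turns out to be infrequent, which shows how the number of passes grows with the length of the longest candidate.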

Various frequent itemset mining (FIM) algorithms have been introduced to address the two drawbacks of the
Apriori algorithm: repeated database scans and the generation of a huge number of candidate itemsets. These
algorithms rely on different data structures and other techniques. CBT-fi for Mining Frequent Itemsets [9]
reduces the number of transactions, scans the database only once and uses less memory. Index-BitTableFI: An
improved algorithm for mining frequent itemsets [4] greatly reduces similar transactions and the search
space. Mining frequent itemsets in large databases: The hierarchical partitioning approach [7] incurs no
extra cost for re-scanning the original database and is a memory-based algorithm for large databases. An
Improved Apriori Algorithm based on Matrix Data Structure [10] scans the database only once, works top-down
and reduces the input/output cost. A Semi-Apriori Algorithm for Discovering the Frequent Itemsets [11]
reduces the candidate itemsets and the total number of database passes. An Association Rule Mining
Algorithm Based On A Boolean Matrix [3] scans the database only once and does not produce candidate
itemsets. Improving the efficiency of Apriori Algorithm in Data Mining [8] reduces the candidate itemsets
and the input/output cost.

III. AN IMPROVED TECHNIQUE

The improved technique for frequent itemset mining consists of the following steps.
Steps:

Step 1: Take the transaction DB and the minimum support (min_supp) as input.

Step 2: Add a Count column to the M b_matrix, where Count represents the size of the row (the number of 1s
in the row), and compute CC, the number of 1s in every column.

Step 3: Delete infrequent items based on min_supp (if CC < min_supp, remove the item's column), and
rearrange the b_matrix in descending order of Count.

Step 4: Count the distinct rows and store the count value in TC in the M b_matrix.

Step 5: For each transaction T in the M b_matrix, if TC >= min_supp, the itemset of T is frequent: move it
with its subsets into FAL, then remove T. Store the remaining rows, with their TC values, in the M'
b_matrix.

Step 6: For each transaction T in the M' b_matrix, extract its itemset and check whether it is in FAL. If
it is present in FAL, no AND operation is needed; otherwise, perform the AND operation with every other row
whose Count value is greater than or equal to T's own Count value. Whenever the result has the same itemset
structure as the row being processed, increase its support count. If the support count is greater than or
equal to the minimum support, the itemset is frequent: store it with its subsets in FAL.

M = Matrix, TC = Transaction Count, FAL = Frequent Array List, CC = Column Count.
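
The AND operation of Step 6 can be sketched with integer bitmasks: ANDing a candidate row with another row and comparing the result with the candidate tells whether the other row contains the candidate's itemset. The sketch below assumes that the support count is the number of rows for which this test succeeds; the names `encode` and `and_support` are hypothetical, and the rows are the six transactions of the paper's example with the infrequent item I4 already removed.

```python
# Sketch under stated assumptions; not the authors' implementation.
ITEMS = ["I1", "I2", "I3", "I5"]  # columns that remain after pruning
BIT = {item: 1 << pos for pos, item in enumerate(ITEMS)}


def encode(itemset):
    """Turn an itemset into a bitmask with one bit per matrix column."""
    mask = 0
    for item in itemset:
        mask |= BIT[item]
    return mask


def and_support(candidate, rows):
    """Count the rows whose AND with `candidate` reproduces `candidate`."""
    own_count = bin(candidate).count("1")
    support_count = 0
    for row in rows:
        # A row with fewer 1s than the candidate cannot contain it,
        # so only rows with Count >= own Count are ANDed.
        if bin(row).count("1") < own_count:
            continue
        if (row & candidate) == candidate:
            support_count += 1
    return support_count


rows = [
    encode(["I1", "I2", "I3", "I5"]),  # T1
    encode(["I2", "I3"]),              # T2
    encode(["I2", "I3"]),              # T3 (I4 removed)
    encode(["I1", "I3", "I5"]),        # T4
    encode(["I1", "I2", "I3"]),        # T5
    encode(["I1", "I3", "I5"]),        # T6
]

print(and_support(encode(["I1", "I3", "I5"]), rows))
print(and_support(encode(["I2", "I3"]), rows))
```

Skipping rows with a smaller Count mirrors the condition in Step 6 and avoids AND operations that cannot succeed.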
Pseudo Code: 
Scan DB and convert into b_matrix M
For each column Mi of M
    if sum(Mi) < min_supp
        delete Mi from M;
For each transaction T in M b_matrix do begin
    count the number of 1s in T and store it in the Count column
end
Rearrange b_matrix in descending order based on Count.
Count distinct rows and store the count value in TC in M b_matrix.
write M, Count, TC
For each transaction T in M b_matrix
    if (TC >= min_supp)
        Extract itemset, it is frequent; move items with subsets into FAL;
        then Remove the T;
    else
        write T to M' b_matrix
For each transaction T in M' b_matrix
    Extract itemset and check into FAL
    if it is present in FAL, do not need AND operation
    else do AND operation;
        If (other respective row Count value >= own Count value)
            do AND operation
        If the result has the same itemset structure as the processing row's itemset structure
            then increase its support count value;
        if (support count >= min_supp)
            extract items, they are frequent; move items with subsets into FAL;

Example: 

Step 1: The following example walks through the steps of the improved technique described above. Here min_supp = 3.
Table 1

TID   Items
T1    I1, I2, I3, I5
T2    I2, I3
T3    I2, I3, I4
T4    I1, I3, I5
T5    I1, I2, I3
T6    I1, I3, I5


Step 2: Add the Count column to Table 1.
Table 2: b_matrix of Table 1 with the Count column [table body missing in source]
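
As a sketch, the b_matrix with its Count column and the column counts CC can be derived from the transactions of Table 1 as follows. The variable names are illustrative and not taken from the paper.

```python
# Sketch: build the boolean b_matrix for the transactions of Table 1.
# Variable names are illustrative assumptions.
transactions = {
    "T1": ["I1", "I2", "I3", "I5"],
    "T2": ["I2", "I3"],
    "T3": ["I2", "I3", "I4"],
    "T4": ["I1", "I3", "I5"],
    "T5": ["I1", "I2", "I3"],
    "T6": ["I1", "I3", "I5"],
}

items = sorted({item for t in transactions.values() for item in t})

# One row per transaction, one 0/1 column per item.
b_matrix = {
    tid: [1 if item in t else 0 for item in items]
    for tid, t in transactions.items()
}

# Count = number of 1s in a row; CC = number of 1s in a column.
count = {tid: sum(row) for tid, row in b_matrix.items()}
cc = {
    item: sum(row[j] for row in b_matrix.values())
    for j, item in enumerate(items)
}

print("TID", *items, "Count")
for tid, row in b_matrix.items():
    print(tid, *row, count[tid])
print("CC ", *(cc[item] for item in items))
```

The database is read once to build the matrix; all later steps work on the matrix only.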



Step 3: Remove item I4 from Table 2 based on min_supp (Table 3), and rearrange the M b_matrix of Table 3 in
descending order of Count (Table 4).

Table 3

TID/I   I1   I2   I3   I5   count
T1      1    1    1    1    4
T2      0    1    1    0    2
T3      0    1    1    0    2
T4      1    0    1    1    3
T5      1    1    1    0    3
T6      1    0    1    1    3

Table 4 [table body missing in source]
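
The pruning and rearranging of Step 3, together with the distinct-row count, can be sketched as below. The sketch assumes that TC is the number of identical rows in the pruned matrix, which is one possible reading of the step description; the names `MIN_SUPP`, `kept`, `order` and `tc` are illustrative.

```python
from collections import Counter

# Sketch under stated assumptions; names are illustrative.
MIN_SUPP = 3
transactions = {
    "T1": ["I1", "I2", "I3", "I5"],
    "T2": ["I2", "I3"],
    "T3": ["I2", "I3", "I4"],
    "T4": ["I1", "I3", "I5"],
    "T5": ["I1", "I2", "I3"],
    "T6": ["I1", "I3", "I5"],
}

# Column counts CC, then drop every item whose CC is below min_supp.
cc = Counter(item for t in transactions.values() for item in t)
kept = sorted(item for item, n in cc.items() if n >= MIN_SUPP)

# Pruned boolean rows; the row sum is the Count column.
rows = {
    tid: tuple(1 if item in t else 0 for item in kept)
    for tid, t in transactions.items()
}

# Rearrange in descending order of Count (ties keep their original order).
order = sorted(rows, key=lambda tid: sum(rows[tid]), reverse=True)

# Assumed meaning of TC: how many times each distinct row occurs.
tc = Counter(rows.values())

for tid in order:
    print(tid, *rows[tid], sum(rows[tid]), tc[rows[tid]])
```

Storing each distinct row once with its TC value is what allows the matrix to shrink after the single database scan.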



Step 4: In the M b_matrix, if TC >= min_supp, the itemset is frequent; move it with its subsets into FAL
(from Table 4): FAL = {(I1,I3,I5), (I2,I3), (I1,I3), (I1,I5), (I3,I5), (I1), (I2), (I3), (I5)}
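
Moving an itemset "with its subsets" into FAL relies on the Apriori property: every non-empty subset of a frequent itemset is itself frequent. A minimal sketch of this expansion is given below; the helper name `with_subsets` is hypothetical, and FAL is modelled as a Python set of frozensets.

```python
from itertools import combinations


def with_subsets(itemset):
    """Return the itemset together with all of its non-empty subsets."""
    items = sorted(itemset)
    return {
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    }


# The two maximal itemsets that appear in the FAL of this step.
fal = set()
for frequent_itemset in [("I1", "I3", "I5"), ("I2", "I3")]:
    fal |= with_subsets(frequent_itemset)

for entry in sorted(fal, key=lambda s: (-len(s), sorted(s))):
    print(tuple(sorted(entry)))
```

Because FAL is a set, the shared subset (I3) is stored only once, which gives the nine entries listed in this step.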

Table 5 



Step 5: Generate the M' b_matrix from Table 5 and use the M' b_matrix to find the frequent itemsets.

Table 6

TID/I   I1   I2   I3   I5   count   TC
T5      1    1    1    0    3       1



Step 6: 

Select the first row, T1, and extract its items {I1, I2, I3, I5}. Check FAL: because the itemset is not
present in FAL, the AND operation is needed. If the result has the same itemset structure as the processing
row's itemset structure, its support count value is increased. Here the result does not have the same
itemset structure as the processing row's itemset structure, so move to the next row.

Select the next row, T5, and extract its items {I1, I2, I3}. Check FAL: because the itemset is not present
in FAL, the AND operation is needed, and the calculated support count value is equal to 4. Since the support
count value is greater than or equal to min_supp, the itemset is frequent, and the items (I1,I2,I3) are
moved with their subsets into FAL.

FAL = {(I1,I3,I5), (I2,I3), (I1,I3), (I1,I5), (I3,I5), (I1), (I2), (I3), (I5), (I1,I2,I3), (I1,I2)}



IV. DISCUSSION 

The Apriori algorithm for extracting frequent itemsets has two main disadvantages. First, it scans the
database multiple times; second, it generates a large number of irregular itemsets. Both increase its space
and time complexity and decrease the overall efficiency of the classical Apriori algorithm. Our improved
technique for frequent itemset mining is intended to resolve these problems of the Apriori algorithm and to
reduce the execution time compared to it.
V. CONCLUSION

We conclude that the improved technique reduces the number of AND operations, compresses the data
structure and finds the frequent itemsets. It reduces the number of transactions and the input/output cost,
finds the frequent itemsets from the largest to the smallest, and scans the original database only once.